Smart Paper Notes

PASH at TREC 2021 Deep Learning Track: Generative Enhanced Model for Multi-stage Ranking

Yixuan Qiao , Hao Chen , Jun Wang , Yongquan Lai , Tuozhen Liu , Xianbin Ye , Xin Tang , Rui Fang , Peng Gao , Wenfeng Xie

Category: Natural Language Processing

2022-05-18

This paper describes the PASH participation in the TREC 2021 Deep Learning Track. In the recall stage, we adopt a scheme combining sparse and dense retrieval methods. In the multi-stage ranking phase, point-wise and pair-wise ranking strategies are used one after another, based on a model continually pre-trained on general knowledge and document-level data. Compared to the TREC 2020 Deep Learning Track, we have additionally introduced the generative model T5 to further enhance performance.
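
The abstract does not say how the sparse and dense candidate lists are merged in the recall stage. As an illustration only, here is a minimal sketch that assumes reciprocal rank fusion, a commonly used technique swapped in here and not confirmed as the PASH method. The function name, the constant `k=60`, and the document ids are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Assumption: each ranking is ordered best-first; k is a smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


sparse_run = ["d1", "d2", "d3"]  # hypothetical sparse (term-matching) ranking
dense_run = ["d3", "d1", "d4"]   # hypothetical dense (embedding) ranking
fused = reciprocal_rank_fusion([sparse_run, dense_run])
print(fused)
```

Documents that appear in both lists accumulate score from each, so they tend to rise above documents retrieved by only one method. The fused list would then be handed to the ranking stages.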

translated by Google Translate

CandidateDrug4Cancer: An Open Molecular Graph Learning Benchmark on Drug Discovery for Cancer

Xianbin Ye , Ziliang Li , Fei Ma , Zongbi Yi , Pengyong Li , Jun Wang , Peng Gao , Yixuan Qiao , Guotong Xie

Category: Machine Learning

2022-03-02

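Anti-cancer drug discoveries have been serendipitous. We present an open molecular graph learning benchmark, named CandidateDrug4Cancer, a challenging and realistic benchmark dataset to facilitate scalable, robust, and reproducible graph machine learning research for anti-cancer drug discovery. The CandidateDrug4Cancer dataset encompasses multiple of the most-mentioned cancer targets and covers 54,869 cancer-related drug molecules, ranging from pre-clinical and clinical to FDA-approved. Besides building the dataset, we also perform benchmark experiments with effective drug-target interaction (DTI) prediction baselines using descriptors and expressive graph neural networks. Experimental results suggest that CandidateDrug4Cancer presents significant challenges for learning molecular graphs and targets in practical applications, indicating opportunities for future research on developing candidate drugs for treating cancer.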

Revisiting Open World Object Detection

Xiaowei Zhao , Xianglong Liu , Yifan Shen , Yuqing Ma , Yixuan Qiao , Duorui Wang

Category: Computer Vision

2022-01-03

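Open World Object Detection (OWOD), which simulates the real dynamic world where knowledge grows continuously, attempts to detect both known and unknown classes and to incrementally learn the identified unknown ones. We find that although the previous OWOD work constructively puts forward the OWOD definition, its experimental settings are unreasonable, with an illogical benchmark, confusing metric calculation, and an inappropriate method. In this paper, we rethink the OWOD experimental setting and propose five fundamental benchmark principles to guide OWOD benchmark construction. Moreover, we design two fair evaluation protocols specific to the OWOD problem, filling the gap in evaluation from the perspective of unknown classes. Furthermore, we introduce a novel and effective OWOD framework containing an auxiliary Proposal ADvisor (PAD) and a Class-specific Expelling Classifier (CEC). The non-parametric PAD helps the RPN identify accurate unknown proposals without supervision, while the CEC calibrates the over-confident activation boundary and filters out confusing predictions through a class-specific expelling function. Comprehensive experiments conducted on our fair benchmark demonstrate that our method outperforms other state-of-the-art object detection approaches in terms of both existing metrics and our new ones. Our benchmark and code are available at https://github.com/RE-OWOD/RE-OWOD.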

Superpixel-Based Building Damage Detection from Post-earthquake Very High Resolution Imagery Using Deep Neural Networks

Jun Wang , Zhoujing Li , Yixuan Qiao , Qiming Qin , Peng Gao , Guotong Xie

Category: Computer Vision

2021-12-09

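Building damage detection after natural disasters such as earthquakes is crucial for initiating effective emergency response actions. Remotely sensed very high spatial resolution (VHR) imagery can provide vital information because it can map the affected buildings with high geometric precision. Many approaches have been developed to detect buildings damaged by earthquakes. However, little attention has been paid to exploiting the rich features represented in VHR images using deep neural networks (DNN). This paper presents a novel superpixel-based approach that combines a DNN with a modified segmentation method to detect damaged buildings from VHR imagery. First, a modified Fast Scanning and Adaptive Merging method is extended to create an initial over-segmentation. Second, the segments are merged based on the Region Adjacency Graph (RAG), using an improved semantic similarity criterion composed of Local Binary Patterns (LBP) texture, spectral, and shape features. Third, a DNN pre-trained with Stacked Denoising Auto-Encoders, called SDAE-DNN, is presented to exploit the rich semantic features for building damage detection. The deep-layer feature abstraction of SDAE-DNN boosts detection accuracy by learning more intrinsic and discriminative features, and the approach outperforms other methods that use state-of-the-art alternative classifiers. We demonstrate the feasibility and effectiveness of our method using WorldView-2 imagery of the complex urban areas of Bhaktapur, Nepal, which was affected by the Nepal earthquake of April 25, 2015.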

Pairwise Half-graph Discrimination: A Simple Graph-level Self-supervised Strategy for Pre-training Graph Neural Networks

Pengyong Li , Jun Wang , Ziliang Li , Yixuan Qiao , Xianggen Liu , Fei Ma , Peng Gao , Seng Song , Guotong Xie

Category: Machine Learning | Artificial Intelligence

2021-10-26

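Self-supervised learning has gradually emerged as a powerful technique for graph representation learning. However, transferable, generalizable, and robust representation learning on graph data remains a challenge for pre-training graph neural networks. In this paper, we propose a simple and effective self-supervised pre-training strategy, named Pairwise Half-graph Discrimination (PHD), that explicitly pre-trains a graph neural network at the graph level. PHD is designed as a simple binary classification task that discriminates whether two half-graphs come from the same source. Experiments demonstrate that PHD is an effective pre-training strategy that offers comparable or superior performance on 13 graph classification tasks compared with state-of-the-art strategies, and that it achieves notable improvements when combined with node-level strategies. Moreover, visualization of the learned representations reveals that the PHD strategy indeed empowers the model to learn graph-level knowledge such as the molecular scaffold. These results establish PHD as a powerful and effective self-supervised learning strategy for graph-level representation learning.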
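
The pairwise task described above can be illustrated by how labelled training pairs might be built. This is a minimal sketch under stated assumptions, not the authors' procedure: the random node partition, the edge-list graph format, and the helper names `split_into_halves` and `make_pairs` are all hypothetical, and at least two graphs are assumed so that a negative partner exists.

```python
import random


def split_into_halves(nodes, edges, rng):
    """Randomly partition the nodes in two and keep the edges inside each part.

    Assumption: a uniform random split; the paper may split differently.
    """
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    parts = (set(shuffled[:mid]), set(shuffled[mid:]))
    return [
        (sorted(part), [(u, v) for u, v in edges if u in part and v in part])
        for part in parts
    ]


def make_pairs(graphs, rng):
    """Return (half_a, half_b, label) triples; label 1 means same source graph."""
    halves = [split_into_halves(n, e, rng) for n, e in graphs]
    pairs = []
    for i, (a, b) in enumerate(halves):
        pairs.append((a, b, 1))  # positive: both halves from graph i
        j = rng.choice([x for x in range(len(halves)) if x != i])
        pairs.append((a, halves[j][1], 0))  # negative: second half from graph j
    return pairs


# Two toy graphs given as (node list, edge list): a path and a 5-cycle.
graphs = [
    ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)]),
    ([0, 1, 2, 3, 4], [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]),
]
pairs = make_pairs(graphs, random.Random(0))
```

A graph encoder would embed each half-graph, and a binary classifier on the two embeddings would be trained to predict the label.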

MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi , Kai Qiao , Jian Chen , Shuai Yang , Jie Yang , Baojie Song , Linyuan Wang , Bin Yan

Category: Computer Vision

2023-01-03

The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hinders graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.

Policy Pre-training for End-to-end Autonomous Driving via Self-supervised Geometric Modeling

Penghao Wu , Li Chen , Hongyang Li , Xiaosong Jia , Junchi Yan , Yu Qiao

Category: Computer Vision

2023-01-03

Witnessing the impressive achievements of pre-training techniques on large-scale data in the fields of computer vision and natural language processing, we wonder whether this idea could be adapted in a grab-and-go spirit to mitigate the sample inefficiency problem in visuomotor driving. Given the highly dynamic and variant nature of the input, the visuomotor driving task inherently lacks view and translation invariance, and the visual input contains a large amount of information that is irrelevant to decision making, which makes predominant pre-training approaches from general vision less suitable for the autonomous driving task. To this end, we propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework designed for policy pre-training in visuomotor driving. We aim to learn policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos. PPGeo is performed in two stages to support effective self-supervised training. In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input. In the second stage, the visual encoder learns a driving policy representation by predicting the future ego-motion and optimizing with the photometric error, based on the current visual observation only. As such, the pre-trained visual encoder is equipped with rich driving-policy-related representations and is thereby competent for multiple visuomotor driving tasks. Extensive experiments covering a wide span of challenging scenarios demonstrate the superiority of our proposed approach, with improvements ranging from 2% to even over 100% with very limited data. Code and models will be available at https://github.com/OpenDriveLab/PPGeo.

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

Jie Liu , Yixiao Zhang , Jie-Neng Chen , Junfei Xiao , Yongyi Lu , Bennett A. Landman , Yixuan Yuan , Alan Yuille , Yucheng Tang , Zongwei Zhou

Category: Computer Vision | Machine Learning

2023-01-02

An increasing number of public datasets have shown a marked clinical impact on assessing anatomical structures. However, each of the datasets is small, partially labeled, and rarely investigates severe tumor subjects. Moreover, current models are limited to segmenting specific organs or tumors and cannot be extended to novel domains and classes. To tackle these limitations, we introduce embeddings learned from Contrastive Language-Image Pre-training (CLIP) into segmentation models, yielding what we call the CLIP-Driven Universal Model. The Universal Model can better segment 25 organs and 6 types of tumors by exploiting the semantic relationship between abdominal structures. The model is developed from an assembly of 14 datasets with 3,410 CT scans and evaluated on 6,162 external CT scans from 3 datasets. We rank first on the public leaderboard of the Medical Segmentation Decathlon (MSD) and achieve state-of-the-art results on Beyond The Cranial Vault (BTCV). Compared with dataset-specific models, the Universal Model is computationally more efficient (6x faster), generalizes better to CT scans from varying sites, and shows stronger transfer learning performance on novel tasks. The design of the CLIP embedding enables the Universal Model to be easily extended to new classes without catastrophically forgetting the previously learned classes.

A Multi-Source Information Learning Framework for Airbnb Price Prediction

Lu Jiang , Yuanhan Li , Na Luo , Jianan Wang , Qiao Ning

Category: Machine Learning

2023-01-01

With the development of technology and the sharing economy, Airbnb, a well-known short-term rental platform, has become the first choice for many young people. Airbnb pricing has long been a problem worth studying. While previous studies achieve promising results, several deficiencies remain: (1) the feature attributes of rentals are not rich enough; (2) the research on rental text information is not deep enough; (3) few studies predict the rental price in combination with the points of interest (POI) around the house. To address these challenges, we propose a multi-source information embedding (MSIE) model to predict Airbnb rental prices. Specifically, we first select statistical features to embed the original rental data. Secondly, we generate word feature vectors and emotional scores from three different kinds of text information and combine them to form the text feature embedding. Thirdly, we use the points of interest (POI) around the rental house to generate a variety of spatial network graphs, and learn the embeddings of these networks to obtain the spatial feature embedding. Finally, this paper combines the three modules into multi-source rental representations and uses a fully connected neural network to predict the price. The analysis of the experimental results shows the effectiveness of our proposed model.

Yuille-Poggio's Flow and Global Minimizer of polynomials through convexification by Heat Evolution

Qiao Wang

Category: Computer Vision

2023-01-01

In this paper, we investigate the possibility of a backward-differential-flow-like algorithm that starts from the minimum of the convexified version of the polynomial. We apply the heat evolution convexification approach through Gaussian filtering, which is in fact an accumulated version of Steklov's regularization. We generalize the fingerprint theory proposed in computer vision by A.L. Yuille and T. Poggio in the 1980s, in particular their fingerprint trajectory equation, to characterize the evolution of minimizers across scale. On the other hand, we propose the "seesaw" polynomials $p(x|s)$ and find a seesaw differential equation $\frac{\partial p(x|s)}{\,ds}=-\frac{1}{p''(x)}$ to characterize the evolution of the global minimizer $x^*(s)$ of $p(x|s)$ as $s$ varies. Essentially, both fingerprints $\mathcal{FP}_2$ and $\mathcal{FP}_3$ of $p(x)$, consisting of the zeros of $\frac{\partial^2 p(x,t)}{\partial x^2}$ and $\frac{\partial^3 p(x,t)}{\partial x^3}$, respectively, are independent of the seesaw coefficient $s$, and upon them we define the Confinement Zone and the Escape Zone. Meanwhile, varying $s$ monotonically conditions the location of the global minimizer of $p(x|s)$, and all these locations form the Attainable Zone. Based on these concepts, we prove that the global minimizer $x^*$ of $p(x)$ can be inversely evolved from the global minimizer of its convexification polynomial $p(x,t_0)$ if and only if $x^*$ is included in the Escape Zone. In particular, we give a detailed analysis for quartic and sixth-degree polynomials.
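
To make the heat evolution step concrete, here is a minimal sketch that computes the evolved polynomial $p(x,t)$ in closed form. It assumes the normalization $u_t = \tfrac{1}{2}u_{xx}$ (Gaussian filtering with variance $t$); the paper may scale time differently. Under that assumption the series $p(x,t)=\sum_k \frac{(t/2)^k}{k!}\,p^{(2k)}(x)$ terminates for polynomials. The function names and the example quartic are hypothetical.

```python
from math import factorial


def second_derivative(coeffs):
    """Coefficients (lowest degree first) of p'' given those of p."""
    return [k * (k - 1) * c for k, c in enumerate(coeffs)][2:]


def heat_evolve(coeffs, t):
    """Coefficients of p(x, t), assuming u_t = 0.5 * u_xx and u(x, 0) = p(x)."""
    result = [0.0] * len(coeffs)
    term, k = list(coeffs), 0
    while term:  # the series stops once repeated differentiation yields nothing
        scale = (t / 2.0) ** k / factorial(k)
        for i, c in enumerate(term):
            result[i] += scale * c
        term, k = second_derivative(term), k + 1
    return result


# Hypothetical non-convex quartic p(x) = x^4 - 4x^2 + x, lowest degree first.
p = [0.0, 1.0, -4.0, 0.0, 1.0]
p_t1 = heat_evolve(p, 1.0)
print(p_t1)
```

For this quartic the evolved polynomial is $x^4 + (6t-4)x^2 + x + 3t^2 - 4t$, so under the assumed normalization it becomes convex once $t \ge 2/3$. At $t=1$ it is $x^4 + 2x^2 + x - 1$.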
